Voice Biometrics by García-Mateo Carmen;Chollet Gérard;
Author:García-Mateo, Carmen;Chollet, Gérard;
Publisher: Institution of Engineering & Technology
Published: 2021-07-30T16:00:00+00:00
5.5 Evaluation corpora
Two corpora were used to evaluate the performance of the speaker de-identification approaches described in Section 5.4. One of them includes many speakers with little training data each, whereas the other includes few speakers with more training data. Using these two corpora makes it possible to evaluate how the performance of the speaker de-identification techniques depends on the amount of training data, and to assess the relevance of having more or fewer speakers available as source speakers for speaker-independent de-identification.
The first dataset is the Voice Cloning Toolkit (VCTK) Corpus [88]. The VCTK Corpus was designed for speech synthesis applications, but its characteristics make it suitable for VC (and therefore de-identification) purposes. This corpus includes more than 100 speakers with various accents, who recorded around 400 utterances each. These utterances include the Rainbow Passage [89], an elicitation paragraph, and sentences selected from a newspaper, the latter being different for each speaker. Since the techniques used in these experiments require a parallel corpus for training VC functions, the elicitation paragraph plus the Rainbow Passage were used for training, whereas the newspaper sentences were used for testing. Note that not all the training sentences are available for all the speakers, so the number of training utterances differs slightly from speaker to speaker.
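The train/test partition described above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes that the shared prompts (Rainbow Passage and elicitation paragraph) can be recognised simply because more than one speaker recorded the same text, while newspaper sentences are unique to one speaker. The function name `split_parallel`, the speaker labels and the toy sentences are all hypothetical.

```python
from collections import defaultdict

def split_parallel(utterances):
    """Split (speaker, text) pairs into per-speaker training and test lists.

    Assumption: a prompt recorded by more than one speaker is a shared
    (parallel) prompt and goes to training; any other prompt goes to test.
    """
    utterances = list(utterances)  # the data is traversed twice

    # First pass: record which speakers read each prompt.
    speakers_by_text = defaultdict(set)
    for speaker, text in utterances:
        speakers_by_text[text].add(speaker)

    # Second pass: route each utterance to the training or the test set.
    train, test = defaultdict(list), defaultdict(list)
    for speaker, text in utterances:
        target = train if len(speakers_by_text[text]) > 1 else test
        target[speaker].append(text)
    return dict(train), dict(test)

# Toy data (hypothetical): speaker p2 lacks one of the shared prompts.
toy = [
    ("p1", "rainbow sentence 1"),
    ("p1", "elicitation sentence 1"),
    ("p1", "news sentence A"),
    ("p2", "rainbow sentence 1"),
    ("p2", "news sentence B"),
    ("p3", "rainbow sentence 1"),
    ("p3", "elicitation sentence 1"),
    ("p3", "news sentence C"),
]
train, test = split_parallel(toy)
print({speaker: len(texts) for speaker, texts in train.items()})
```

In the toy data, p2 ends up with fewer training utterances than p1 and p3, mirroring the remark that the number of training utterances is not the same for every speaker.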
The second dataset used in these experiments is that of the Voice Conversion Challenge 2016 (VCC 2016) [90]. This corpus is based on the Device and Produced Speech (DAPS) dataset [91], a freely available corpus recorded by professional US English speakers in a recording studio. Specifically, the “clean” version of the dataset was used in VCC 2016. Ten speakers were selected for this corpus, with around 13 min of speech from each, split into train and test sets. The utterances are sentences from public domain books (novels such as Alice’s Adventures in Wonderland, Twenty Thousand Leagues Under the Seas, and Treasure Island, among others). All the speakers recorded the same sentences, so there is a parallel corpus for every pair of speakers.
Some statistics of the VCTK and VCC 2016 corpora, as used in these experiments, are summarized in Table 5.2.
Table 5.2 Experimental framework

| | VCTK | VCC 2016 |
|---|---|---|
| # of speakers | 109 | 10 |
| Average # of training utterances | 23 | 162 |
| Average # of test utterances | 383 | 54 |
| Average duration of training utterances (per speaker) | 2 min 30 s | 9 min 41 s |
| Average duration of test utterances (per speaker) | 21 min 44 s | 2 min 57 s |
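As a rough aid to reading Table 5.2, the per-speaker figures can be turned into an approximate length per utterance. The sketch below only divides the average duration by the average utterance count from the table; this is a ratio of averages, so it approximates the true mean utterance length but need not equal it. The names `TABLE` and `seconds_per_utterance` are hypothetical.

```python
# Per-speaker averages copied from Table 5.2: (utterances, minutes, seconds).
TABLE = {
    ("VCTK", "train"): (23, 2, 30),
    ("VCTK", "test"): (383, 21, 44),
    ("VCC 2016", "train"): (162, 9, 41),
    ("VCC 2016", "test"): (54, 2, 57),
}

def seconds_per_utterance(count, minutes, seconds):
    """Approximate utterance length: average duration over average count."""
    return (minutes * 60 + seconds) / count

for (corpus, part), row in TABLE.items():
    print(f"{corpus:9s} {part:5s} {seconds_per_utterance(*row):.1f} s per utterance")
```

With the table's figures this gives roughly 6.5 s per VCTK training utterance and between 3 and 4 s per utterance for the other three sets. The VCTK training prompts therefore appear to be longer sentences, even though there are far fewer of them per speaker.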
